DRO: Deep Recurrent Optimizer for Video to Depth

Abstract

There is increasing interest in studying the video-to-depth (V2D) problem with machine learning techniques. While earlier methods directly learn a mapping from images to depth maps and camera poses, more recent works enforce multi-view geometry constraints through optimization embedded in the learning framework. This paper presents a novel optimization method based on recurrent neural networks to further exploit the potential of neural networks in V2D. Specifically, our neural optimizer alternately updates the depth and camera poses through iterations to minimize a feature-metric cost, and two gated recurrent units iteratively improve the results by tracing historical information. Extensive experimental results demonstrate that our method outperforms previous methods and is more efficient in computation and memory consumption than cost-volume-based methods. In particular, our self-supervised method outperforms previous supervised methods on the KITTI and ScanNet datasets. Our source code will be made public.
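The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' method: plain gradient steps stand in for the learned gated recurrent updates, the "image" is a 1-D analytic feature function, and every name here (`feature`, `cost`, `alternate`, the step sizes, the sample data) is a hypothetical choice made for illustration only.

```python
import math

def feature(x):
    # Hypothetical smooth 1-D "feature map" (assumption, not from the paper).
    return math.sin(x)

def dfeature(x):
    # Derivative of the toy feature map.
    return math.cos(x)

def cost(inv_depth, pose, xs, targets):
    # Toy feature-metric cost: squared difference between the feature sampled
    # at the warped coordinate x + pose * inv_depth and the target feature.
    return sum((feature(x + pose * w) - r) ** 2
               for x, w, r in zip(xs, inv_depth, targets))

def alternate(xs, targets, inv_depth, pose, iters=200, lr_depth=0.2, lr_pose=0.02):
    # Alternately update per-pixel inverse depth (pose fixed) and the scalar
    # pose (depth fixed). Gradient steps replace the learned GRU updates.
    inv_depth = list(inv_depth)
    history = [cost(inv_depth, pose, xs, targets)]
    for _ in range(iters):
        # Depth step: each pixel's inverse depth moves down its own gradient.
        for i, (x, r) in enumerate(zip(xs, targets)):
            u = x + pose * inv_depth[i]
            inv_depth[i] -= lr_depth * 2.0 * (feature(u) - r) * dfeature(u) * pose
        # Pose step: a single shared parameter gathers gradients from all pixels.
        grad_pose = sum(
            2.0 * (feature(x + pose * w) - r) * dfeature(x + pose * w) * w
            for x, w, r in zip(xs, inv_depth, targets))
        pose -= lr_pose * grad_pose
        history.append(cost(inv_depth, pose, xs, targets))
    return inv_depth, pose, history

# Synthetic data: targets are generated from a known inverse depth and pose.
xs = [0.0, 0.3, 0.6, 0.9, 1.2]
true_inv_depth = [1.0, 0.8, 0.6, 0.5, 0.4]
true_pose = 0.5
targets = [feature(x + true_pose * w) for x, w in zip(xs, true_inv_depth)]

inv_depth, pose, history = alternate(xs, targets, [0.7] * 5, 0.3)
```

In this toy only the product of pose and inverse depth is observable, so the recovered values need not match the generating ones even when the cost is driven close to zero. That mirrors the scale ambiguity of monocular setups in general and is not a statement about the paper's results.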

Related resources

Hierarchical Deep Recurrent Architecture for Video Understanding

This paper introduces the system we developed for the YouTube-8M Video Understanding Challenge, in which a large-scale benchmark dataset [1] was used for multi-label video classification. The proposed framework contains a hierarchical deep architecture, including the frame-level sequence modeling part and the video-level classification part. In the frame-level sequence modeling part, we explore ...

Recurrent Temporal Deep Field for Semantic Video Labeling—Supplementary Material

1. Derivation of Mean-field Updating Equations. In the following we present a detailed derivation of the mean-field inference algorithm which is explained in Sec. 4 in [2] (i.e., Inference of RTDF). Here, we use the same notation as in [2]. The Kullback-Leibler divergence between $Q(y,h;\mu,\nu)$ and $P(y_t,h_t \mid y_{<t}, I_t)$ is defined as $\mathrm{KL}(Q\,\|\,P) = \sum_{y_t,h_t} Q(y,h;\mu,\nu)\,\ln\frac{Q(y,h;\mu,\nu)}{P(y_t,h_t \mid y_{<t}, I_t)} = -H(Q) - \sum_{y_t,\ldots}$ ...

Recurrent Temporal Deep Field for Semantic Video Labeling

This paper specifies a new deep architecture, called Recurrent Temporal Deep Field (RTDF), for semantic video labeling. RTDF is a conditional random field (CRF) that combines a deconvolution neural network (DeconvNet) and a recurrent temporal restricted Boltzmann machine (RTRBM). DeconvNet is grounded onto pixels of a new frame for estimating the unary potential of the CRF. RTRBM estimates a hi...

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks

Progress in deep learning is slowed by the days or weeks it takes to train large models. The natural solution of using more hardware is limited by diminishing returns, and leads to inefficient use of additional resources. In this paper, we present a large batch, stochastic optimization algorithm that is both faster than widely used algorithms for fixed amounts of computation, and also scales up...

Journal

Journal title: IEEE Robotics and Automation Letters

Year: 2023

ISSN: 2377-3766

DOI: https://doi.org/10.1109/lra.2023.3260724